improve KVM hosts reset (by not rebooting in script)#2658
khmarochos wants to merge 1 commit into apache:master from khmarochos:fix_issue_2657
Conversation
|
But will this work? A hanging NFS mount will put those procs in status 'D' and they can't be killed. Not even with -9. |
|
Well, what we need is to fence a failed node, so it's better to break a few virtualization processes than to reboot a whole host that may be running many other healthy guests on other primary storages. That is exactly the situation I ran into the day before I proposed this change.
As an alternative, we could add a blocking rule in iptables that makes the NFS target completely unreachable, to make sure the host cannot write anything there.
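A minimal sketch of that iptables alternative, assuming the NFS server's address is known. The helper name `nfs_block_rules` and the address `192.0.2.10` are hypothetical and not part of this PR. The function only prints the rules, so they can be reviewed before anything is applied.

```shell
#!/bin/bash
# Hypothetical helper (not from the PR): print iptables rules that cut
# this host off from a single NFS server.
nfs_block_rules() {
    local server_ip="$1"   # address of the NFS primary storage (assumed known)
    # Drop everything this host sends to the NFS server ...
    printf 'iptables -I OUTPUT -d %s -j DROP\n' "$server_ip"
    # ... and everything that comes back from it.
    printf 'iptables -I INPUT -s %s -j DROP\n' "$server_ip"
}
```

Running `nfs_block_rules 192.0.2.10 | sh` as root would apply the rules. This should keep new writes from reaching the target, but processes already blocked on a hard mount would most likely stay blocked.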
On Jun 7, 2018, at 09:41, Wido den Hollander wrote:
But will this work? A hanging NFS mount will put those procs in status 'D' and they can't be killed. Not even with -9.
|
@Melnik13 I understand what you are saying, but a proc in status D won't die. So it's not truly fenced. That's the difficulty. You might want to go for a 'kill -9' just to be sure. |
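The concern about status D can be checked explicitly after the kill. The sketch below is only an illustration under the assumption of a Linux host with procfs; `proc_state` and `kill_and_verify` are hypothetical names, not functions from this PR. It sends SIGKILL and then reports whether the process actually went away.

```shell
#!/bin/bash
# Print the one-letter state (R, S, D, Z, ...) of a PID, or nothing if
# the process does not exist. Linux-only: reads /proc/<pid>/status.
proc_state() {
    awk '/^State:/ {print $2}' "/proc/$1/status" 2>/dev/null || true
}

# Send SIGKILL, then poll once per second for up to $2 seconds (default 5).
# Returns 0 if the process is gone or is a zombie waiting to be reaped,
# 1 if it is still there, for example stuck in uninterruptible sleep (D).
kill_and_verify() {
    local pid="$1" tries="${2:-5}" state=""
    kill -9 "$pid" 2>/dev/null
    while [ "$tries" -gt 0 ]; do
        state="$(proc_state "$pid")"
        if [ -z "$state" ] || [ "$state" = "Z" ]; then
            return 0
        fi
        sleep 1
        tries=$((tries - 1))
    done
    echo "PID $pid is still present in state $state" >&2
    return 1
}
```

A non-zero return here would mean the guest process was not really fenced, so the caller could fall back to another measure for that case.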
|
@blueorangutan package |
|
@borisstoyanov a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos6 ✔centos7 ✖debian. JID-2147 |
|
@blueorangutan test |
|
@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
|
Trillian test result (tid-2807)
|
|
@Melnik13 is this PR still valid? |
|
@rafaelweingartner, |
Description
I propose killing the qemu-kvm processes of the VMs whose volumes are located on NFS storage instead of rebooting the whole host.
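One possible way to select those processes is sketched below. This is not the code from this PR: the function name `pids_with_files_under` is hypothetical, and it assumes a Linux host where the guest process name is `qemu-kvm` and the disk images are held open under the storage mount point. It inspects only procfs symlinks, so it should not itself block on a hung mount.

```shell
#!/bin/bash
# List PIDs of processes named $2 (default: qemu-kvm) that hold at least
# one open file under directory $1. Hypothetical helper for illustration.
pids_with_files_under() {
    local mount_point="${1%/}" name="${2:-qemu-kvm}"
    local dir pid fd target
    for dir in /proc/[0-9]*; do
        pid="${dir#/proc/}"
        # Match on the kernel's short process name.
        [ "$(cat "$dir/comm" 2>/dev/null)" = "$name" ] || continue
        for fd in "$dir"/fd/*; do
            # readlink reads only the procfs symlink; it does not
            # access the target file on the mount.
            target="$(readlink "$fd" 2>/dev/null)" || continue
            case "$target" in
                "$mount_point"/*) echo "$pid"; break ;;
            esac
        done
    done
}
```

With a mount point passed as the first argument, the resulting PIDs could then be handed to `kill -9`; reading another user's `/proc/<pid>/fd` requires root.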
Types of changes
GitHub Issue/PRs
Fixes: #2657
Screenshots (if appropriate):
How Has This Been Tested?
Tested on a KVM host in my environment running CentOS 7.x.
Checklist:
Testing
Fixes #2657